Frontiers in Psychology



PERSPECTIVE ARTICLE 

published: 18 June 2013 
doi: 10.3389/fpsyg.2013.00332




Comparing apples and pears in studies on magnitude 
estimations 

Mirjam Ebersbach 1*, Koen Luwel 2,3 and Lieven Verschaffel 3

1 Department of Psychology, University of Kassel, Kassel, Germany

2 Educational Research and Development, Hogeschool-Universiteit Brussel, Brussel, Belgium 

3 Faculty of Psychology and Educational Sciences, Katholieke University Leuven, Leuven, Belgium 



Edited by: 

Andrea Bender, University of 
Freiburg, Germany 

Reviewed by: 

Maria Olkkonen, Rutgers University, 
USA 

Korbinian Moeller, Knowledge 
Media Research Center, Germany 
Samar Zebian, Lebanese American 
University, Lebanon 

*Correspondence:
Mirjam Ebersbach, Department of Developmental Psychology, Institute
of Psychology, University of Kassel, Hollaendische Str. 36-38, D-34121
Kassel, Germany
e-mail: mirjam.ebersbach@uni-kassel.de



The present article is concerned with studies on magnitude estimations that sought to
uncover the underlying mental representation(s) of magnitudes. We point out a number
of methodological differences and shortcomings that make it difficult to draw general
conclusions. To address this problem, we propose a taxonomy by which those studies could
be classified, taking into account central methodological aspects of magnitude estimation
tasks. Finally, we suggest perspectives for future research on magnitude estimations,
which might abandon the hunt for the mathematical model that explains estimations best
and turn, instead, to investigating the underlying principles of estimations (e.g., strategies)
and ways of improving them.
INTRODUCTION

There is an ongoing debate among researchers concerned with 
magnitude estimations on how the relationship between subjec- 
tive estimations and objective magnitudes may be described best. 
This issue is important as poor estimation performance - such as 
in number line estimation tasks - is associated with limited math- 
ematical abilities in children (e.g., Booth and Siegler, 2008; Geary 
et al., 2008). Furthermore, the characteristics of the underly- 
ing mental representation of magnitudes including its systematic 
biases are often directly inferred from the estimations (e.g., Siegler 
and Opfer, 2003). 

Initially, two fundamental models were proposed to describe how
magnitudes might be mentally represented. The logarithmic ruler
model (Dehaene et al., 1990; Dehaene, 1997) assumes that magnitudes
are represented with constant variability on a mental number
line. However, representations of larger numbers are located
closer to each other and thus overlap more than those of smaller
numbers. The accumulator model, in contrast, states that magnitudes
are represented equidistantly but with proportionally increas- 
ing variability (i.e., scalar variability: Gibbon and Church, 1981; 
Whalen et al., 1999; Huntley-Fenner, 2001). Studies promoting 
the logarithmic model usually employed relative magnitude esti- 
mation tasks such as identifying the larger of two numbers (e.g., 
Dehaene et al., 1990), while other studies used absolute magni- 
tude estimations that required the approximate transformation 
between two magnitudes (e.g., generating 23 key presses without 
counting: Whalen et al., 1999). Only absolute magnitude estimations
will be considered further in this article, as only these
represent estimations in a narrower sense, that is, "a process of
translating between alternative quantitative representations, at
least one of which is inexact" (Siegler and Booth, 2005, p. 198).
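
The contrast between the two models can be illustrated with a minimal sketch. It assumes Gaussian-like representations described only by a position and a spread; the function names and the values of `sigma` and `w` are hypothetical and are not taken from the cited studies. Under these assumptions, both models predict that neighboring large numbers are harder to tell apart than neighboring small numbers, although for different reasons.

```python
import math

def log_ruler(n, sigma=0.2):
    """Logarithmic ruler: position ln(n), constant spread (sigma is an assumed value)."""
    return math.log(n), sigma

def accumulator(n, w=0.2):
    """Accumulator: position n, spread proportional to n (w is an assumed Weber-like fraction)."""
    return float(n), w * n

def separation(model, a, b):
    """Distance between the representations of a and b, in units of their mean spread."""
    mu_a, sd_a = model(a)
    mu_b, sd_b = model(b)
    return abs(mu_b - mu_a) / ((sd_a + sd_b) / 2)

# Neighboring numbers become less separable as magnitude grows in both models.
for a, b in [(2, 3), (20, 21), (200, 201)]:
    print(a, b, round(separation(log_ruler, a, b), 3), round(separation(accumulator, a, b), 3))
```

Because both models make this prediction, relative judgments of this kind are unlikely to distinguish them on their own.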

Based on the initial accounts, Siegler and colleagues (e.g., Siegler 
and Opfer, 2003) investigated how the estimation pattern of mag- 
nitudes develops using the number line task. Participants usually 
mark the position of given numbers on a number line, ranging for 
instance from 0 to 100 or from 0 to 1000. It has been demonstrated 
that in young children and for relatively large number ranges, 
in particular, the estimation pattern exhibits a logarithmic shape, 
whereas for small number ranges and in older children and adults, 
the pattern is linear and quite exact, without scalar variability. 

This research was the starting point for further studies aim- 
ing to explain typical biases in numerical estimations of children 
and adults. Alternative models to a logarithmic model with con- 
stant variability and a linear model with scalar variability have been 
proposed, that is, segmented linear models (e.g., Ebersbach et al., 
2008; Moeller et al., 2009) or a cyclic power model (e.g., Barth and 
Paladino, 2011). Moreover, a simple power model, adopted from 
psychophysical research (e.g., Stevens, 1957), was put forward to 
describe systematic biases in adults' numerical estimations (e.g., 
Crollen et al., 2011). A debate has started and is still going on about
which model is best suited to explain the relationship between esti- 
mations and actual magnitudes (e.g., Barth and Paladino, 2011; 
Opfer et al., 2011; Ashcraft and Moore, 2012; Bouwmeester and 
Verkoeijen, 2012). 
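
To make the competing accounts concrete, the following sketch writes each model as a function that maps a target magnitude onto a predicted estimate. The parameterizations are common textbook forms and are assumptions made here for illustration; in particular, the two-segment form and the one-cycle proportion-judgment form may differ in detail from those fitted in the cited studies, and all parameter names are hypothetical.

```python
import math

def linear(x, slope, intercept):
    return slope * x + intercept

def logarithmic(x, scale, intercept):
    return scale * math.log(x)  + intercept

def power(x, scale, exponent):
    # Simple power function as used in psychophysics.
    return scale * x ** exponent

def segmented_linear(x, intercept, slope1, break_at, slope2):
    # Two linear segments that meet at the break point (continuous by construction).
    if x <= break_at:
        return intercept + slope1 * x
    return intercept + slope1 * break_at + slope2 * (x - break_at)

def one_cycle_power(x, exponent, upper):
    # One-cycle power model on a line from 0 to `upper`: estimates are exact
    # at both anchors and at the midpoint, whatever the exponent.
    p = x / upper
    return upper * p ** exponent / (p ** exponent + (1 - p) ** exponent)

print(one_cycle_power(25, 0.6, 100), one_cycle_power(50, 0.6, 100))
```

With an exponent below 1, the one-cycle form predicts overestimation below the midpoint and underestimation above it, which is one way in which reference points can shape the estimation pattern.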

In the present article, we aim to emphasize that studies
employing absolute magnitude estimations to investigate the characteristics
of the mental representation of magnitudes are often
hard to compare, as they involve a broad range of methodological
approaches that may lead to different outcomes concerning
the shape, accuracy, or variability of the estimations. To
address this problem, we propose a taxonomy by which many 
of the conducted (and future) studies may be classified, which 
might help to evaluate the comparability, reliability, and valid- 
ity of studies. Finally, implications for future research will be 
suggested. 

LACKING COMPARABILITY OF THE STUDIES 

Studies involving absolute magnitude estimations differ broadly 
with regard to the tasks, the stimuli, and the methods of analy- 
sis. Hence, even additional studies might provide no further clarity 
on children's and adults' estimation abilities and the nature of 
their underlying mental representations as long as apples and pears 
are collected into the same basket. In the following, we will give 
some examples that are directly related to the taxonomy proposed 
later. 

First, estimations can be conceived as numerical or non- 
numerical (Siegler and Booth, 2005). Numerical estimations 
involve a magnitude in a symbolic format (i.e., a number word 
or a numeral) that has to be transferred approximately into 
another - symbolic or non-symbolic - magnitude (e.g., telling 
the number of dots), or vice versa. This type will be referred 
to as symbolic estimations in the following. Non-numerical, or
non-symbolic, estimations, in contrast, refer to the approximate
transformation between two non-symbolic magnitudes (e.g.,
reproducing a number of dots by key presses). Crollen et al. 
(2011) showed that symbolic and non-symbolic estimations of 
adults differ both qualitatively and quantitatively. Symbolic esti- 
mations yielded typical biases - that is, under- or overestimations, 
respectively - that could be well described by a power function. 
Non-symbolic estimations (i.e., reproduction task), in contrast, 
were relatively accurate and were described best by a largely linear 
function. 

The differences between symbolic and non-symbolic estima- 
tions might be explained by the assumption of format-dependent 
representations of magnitudes (e.g., Dehaene, 1992; Cohen Kadosh 
et al., 2011; Lyons et al., 2012; for a review see Cohen Kadosh 
et al., 2008), although a format-independent representation has 
been proposed, too (e.g., McCloskey et al., 1985; Barth et al., 2003; 
Walsh, 2003). Evidence for format-dependent representations 
comes from fMRI measures showing that different formats acti- 
vate distinct brain regions (e.g., Vogel et al., 2013). Furthermore, 
Roggeman et al. (2007) provided evidence that (at least small) sym- 
bolic magnitudes are mentally represented by place codes, that is, as 
activation of a specific position on the mental number line, corre- 
sponding to the target magnitude. Non-symbolic magnitudes, in 
contrast, are represented by summation codes, that is, as activation 
of a whole segment of the number line up to the corresponding 
position of the target magnitude. Place codes reflect a local and 
thus more precise activation on the number line than summa- 
tion codes and might thus explain a higher accuracy of symbolic 
compared to non-symbolic estimations. Furthermore, it has been 
assumed that different transformation paths exist between distinct
representational codes (e.g., bi-directional mapping model:
Castronovo and Seron, 2007) that might differently affect children's
estimations, in particular, as their number knowledge is not
yet fully developed. They might perform more poorly in symbolic estimations
that require the comprehension or production of number
symbols, compared to non-symbolic estimations. Evidence for this 
assumption stems from children's magnitude comparisons (see 
Rousselle and Noel, 2007) as well as from differential effects of 
language characteristics on number line estimations (Helmreich 
et al., 2011). However, most of the studies so far that sought to
examine the mental representation of magnitudes involved only
symbolic estimations, which is in particular true for research with 
children (for an exception see Mejias et al., 2012). It might be 
worthwhile to directly compare the performance in symbolic and
non-symbolic estimations and to relate it to symbolic number
knowledge. Sasanguie et al. (2012), for instance, found
that children's performance in a symbolic and in a non-symbolic
number line task was highly correlated, but that only symbolic
task performance was associated with math performance, even when
non-symbolic task performance was controlled for (see also Sasanguie
et al., 2013).

Furthermore, different types of tasks were used within sym- 
bolic estimations, such as position-to-number tasks (or percep- 
tion tasks), where symbolic numbers have to be assigned to 
given non-symbolic magnitudes (e.g., Ashcraft and Moore, 2012) 
and number-to-position tasks (or production tasks), where non- 
symbolic magnitudes have to be generated that match given sym- 
bolic numbers (e.g., Barth and Paladino, 2011). Crollen et al. 
(2011) have shown that both tasks yield opposing biases (i.e., overestimations
in the production task and underestimations in the
perception task) and different error rates in adults. Poorer performance
in a production-like task compared to a perception-like task
was also reported for children (Mundy and Gilmore, 2009; Mejias
et al., 2012).

In addition, the target stimuli to be estimated differed. 
Continuous stimuli, such as in the number line paradigm (e.g., 
Siegler and Opfer, 2003), and discrete stimuli (e.g., numbers of 
dots, Crollen et al., 2011) have been used. Boyer et al. (2008) 
showed that children perform better in proportional judgments
of liquids when these are presented as continuous amounts than
as discrete units, probably because discrete magnitudes tempted them to
apply counterproductive counting mechanisms and suppressed a
more intuitive approach. Moreover, children were also more accu- 
rate in comparing continuous (i.e., lengths of bars) than discrete 
magnitudes (i.e., numbers of dots; Barth et al., 2009). 

Taken together, research so far has shown that the estimation 
type (i.e., symbolic vs. non-symbolic), task type (i.e., perception, 
production), and target type (continuous vs. discrete) might dif- 
ferently affect the shape and accuracy of magnitude estimations as 
well as the direction of the biases in terms of under- and overesti- 
mations. The next two issues refer to the variability and, again, to 
the shape of the estimations and, relatedly, to the inferred shape of 
the underlying mental representation of magnitudes. 

Magnitude estimations can be bounded (e.g., number line 
tasks with lower and upper anchor points: Siegler and Opfer, 
2003) or unbounded with no upper anchor cue (e.g., Booth and 
Siegler, 2006, Exp. 1; Whalen et al., 1999; Cohen and Blanc- 
Goldhammer, 2011). This issue is relevant in particular for the 
question of whether or not the estimations exhibit the signa- 
ture of scalar variability. It seems likely that only unbounded 
tasks with no upper anchor cue would yield scalar variability,
as they do not allow for adjusting large estimations to an
upper limit (Ebersbach et al., 2008). Furthermore, young chil- 
dren who lack an understanding of large numbers might fail to 
use the upper numerical anchor and their estimations thus might 
exhibit scalar variability, too, in cases where the upper anchor is 
unfamiliar. 

Moreover, within the bounded number line tasks, many stud- 
ies provided a lower and an upper anchor, such as 0 or 1 and 
100 (e.g., Siegler and Opfer, 2003; Siegler and Booth, 2004; Booth 
and Siegler, 2006; Laski and Siegler, 2007; Opfer and Siegler, 2007; 
Ebersbach et al., 2008; Opfer and Thompson, 2008; Thompson and 
Opfer, 2008; Ashcraft and Moore, 2012; Ebersbach, in press), while 
in other studies the location of an additional reference point was 
explicitly referred to in the pre-test (e.g., the location of 50 on a 
number line of 0-100; Barth and Paladino, 2011; Bouwmeester and
Verkoeijen, 2012; Slusser et al., 2013). The explicit indication of an 
additional reference point might have affected the shape of the esti- 
mations and facilitated the calibration of the estimations around 
the additional reference point. As a result, estimations might be 
best described by a cyclic power model with relatively accurate estimations
near the reference points, while the absence of a third
reference point might rather yield a better fit with a logarithmic 
model. 

METHODOLOGICAL TAXONOMY 

So far, we illustrated methodological differences between studies 
that might account for the often heterogeneous findings concern- 
ing the shape, variability, and accuracy of magnitude estimations. 
To address this shortcoming, we propose a taxonomy into which each
of the paradigms used might be classified (see Figure 1). This taxonomy
accounts for (1) the question of whether symbolic numerals
(or number words) are involved in the estimations or not (i.e.,
symbolic vs. non-symbolic estimations), (2) the type of the estimation
task (i.e., perception, production, reproduction), (3) the
type of the target stimuli (i.e., discrete vs. continuous), (4) the
potential range of the estimations (i.e., bounded vs. unbounded),
and (5) whether an additional reference point was provided or
not. For instance, a classical number line paradigm (e.g., Siegler
and Opfer, 2003) involves symbolic/numerical estimations in terms
of a production task, in which the target stimuli are continuous,
the estimation range is bounded by anchors, and no additional
reference point is provided.
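
One way to apply the taxonomy when comparing studies is to code each paradigm on the five dimensions and list the dimensions on which two paradigms differ. The sketch below is only an illustration: all class and field names are hypothetical, the number line entry follows the classification given above, and the second paradigm is an invented example rather than a description of a published study.

```python
from dataclasses import dataclass, fields
from enum import Enum

class EstimationType(Enum):
    SYMBOLIC = "symbolic"
    NON_SYMBOLIC = "non-symbolic"

class TaskType(Enum):
    PERCEPTION = "perception"
    PRODUCTION = "production"
    REPRODUCTION = "reproduction"

class TargetType(Enum):
    DISCRETE = "discrete"
    CONTINUOUS = "continuous"

class EstimationRange(Enum):
    BOUNDED = "bounded"
    UNBOUNDED = "unbounded"

@dataclass(frozen=True)
class Paradigm:
    estimation: EstimationType
    task: TaskType
    target: TargetType
    estimation_range: EstimationRange
    additional_reference_point: bool

def differing_dimensions(a, b):
    """Names of the taxonomy dimensions on which two paradigms differ."""
    return [f.name for f in fields(Paradigm) if getattr(a, f.name) != getattr(b, f.name)]

# Classical number line paradigm, classified as in the text.
number_line = Paradigm(EstimationType.SYMBOLIC, TaskType.PRODUCTION,
                       TargetType.CONTINUOUS, EstimationRange.BOUNDED, False)

# Hypothetical paradigm for illustration: naming the number of dots without an upper anchor.
dot_naming = Paradigm(EstimationType.SYMBOLIC, TaskType.PERCEPTION,
                      TargetType.DISCRETE, EstimationRange.UNBOUNDED, False)

print(differing_dimensions(number_line, dot_naming))
```

Two paradigms that differ on several dimensions at once, as in this example, would not be expected to yield directly comparable estimation patterns.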

SUGGESTIONS FOR FUTURE RESEARCH 

First of all, it could be useful to systematically manipulate the 
methodological aspects proposed in the taxonomy in magnitude 
estimation tasks. This might allow determining if previous findings 
concerning the shape, variability, and accuracy of magnitude esti- 
mations and their underlying mental representation, as well as the 
emergence of systematic estimation skills in the course of develop- 
ment are generalizable or if they apply only to certain paradigms. 
If the latter were true, the different paradigms might tap different
mental representations or, perhaps, a certain paradigm might not be a
reliable and valid instrument to investigate the underlying mental
representation (cf. Moeller and Nuerk, 2011).

Second, when aiming at identifying the mathematical model 
that explains magnitude estimations best, one has to take into 
account that model fits are affected, among other things, by the number
of trials, the number of parameters of the model, the question of
whether a constant is estimated or not, and by the intra-individual 
variability of the estimations, which is relatively large in young chil- 
dren, in particular. One way to account for the model errors as well 
as for the number of free parameters would be using the Akaike 
information criterion (AIC), though it refers only to the relative fit 
of alternative models. 
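
As a minimal sketch of such a comparison, the code below fits a linear and a logarithmic model to one participant's estimates by ordinary least squares and computes the AIC in its least-squares form, n * ln(RSS / n) + 2k. This form assumes normally distributed errors with constant variance, which is itself an assumption that may not hold for estimation data. The data are invented for illustration and the function names are hypothetical.

```python
import math

def fit_line(u, y):
    """Ordinary least squares for y = slope * u + intercept; returns (slope, intercept, rss)."""
    n = len(u)
    mean_u, mean_y = sum(u) / n, sum(y) / n
    sxx = sum((ui - mean_u) ** 2 for ui in u)
    sxy = sum((ui - mean_u) * (yi - mean_y) for ui, yi in zip(u, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_u
    rss = sum((yi - (slope * ui + intercept)) ** 2 for ui, yi in zip(u, y))
    return slope, intercept, rss

def aic_least_squares(rss, n, k):
    """AIC for a least-squares fit with k free parameters (lower is better)."""
    return n * math.log(rss / n) + 2 * k

# Invented data: targets on a 0-100 number line and one participant's estimates.
targets = [2, 5, 10, 20, 35, 50, 65, 80, 95]
estimates = [17, 33, 52, 64, 78, 84, 92, 94, 99]

n = len(targets)
_, _, rss_linear = fit_line(targets, estimates)
# The logarithmic model is linear in ln(target), so the same routine applies.
_, _, rss_log = fit_line([math.log(t) for t in targets], estimates)

aic_linear = aic_least_squares(rss_linear, n, k=2)
aic_log = aic_least_squares(rss_log, n, k=2)
print("AIC linear:", aic_linear, "AIC logarithmic:", aic_log)
```

Only the difference between the two AIC values is interpretable. A lower value indicates the relatively better model, not that either model describes the data well in absolute terms.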

FIGURE 1 | Taxonomy of paradigms of studies on magnitude estimations.

Third, shortcomings in deciding which model describes the
shape of estimations of individual participants best should be prevented.
Previous approaches differed widely, ranging from inferential
statistics (i.e., comparing adjusted R² values or the absolute
values of the residuals of each model by t-tests: Siegler and Opfer,
2003; Moeller et al., 2009; Berteletti et al., 2012; though it cannot
even be assumed that these parameters are Gaussian: May
et al., 1989; Edgington and Onghena, 2007) to purely descriptive
accounts (i.e., comparing R² values or likelihoods of each model by
visual inspection: Thompson and Opfer, 2008; Barth and Paladino,
2011; Cohen and Blanc-Goldhammer, 2011; Ashcraft and Moore,
2012). A descriptive approach provides at most a heuristic but
not a reliable decision rule (Glover and Dixon, 2004). Even slight
differences between competing model fits might be overvalued
if estimations were, for instance, classified as being linear only
because the fit of a linear model is R² = 0.731 and that of an
alternative model is R² = 0.730. In this regard, a statistically based
classification method seems necessary to avoid arbitrary results
(see also Moeller and Nuerk, 2011; Bouwmeester and Verkoeijen,
2012). In addition, if individual data are analyzed in future
research, a larger number of trials should be used to get more reliable
data, although it also needs to be considered that a multitude
of trials might reduce participants' motivation and yield interfering
learning effects.
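
One possible statistically based decision rule that does not assume Gaussian residuals is a randomization test on the paired absolute residuals of two models. The sketch below is an illustration of that general idea and not a procedure proposed in the studies cited above; the residual values are invented and the function name is hypothetical. It enumerates all sign flips of the trial-wise differences, which is feasible only for a small number of trials.

```python
from itertools import product

def sign_flip_p_value(diffs):
    """Exact two-sided randomization test of whether the mean paired difference is zero."""
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed - 1e-12:  # small tolerance for floating-point rounding
            extreme += 1
    return extreme / total

# Invented absolute residuals of one participant under two fitted models.
abs_res_linear = [12.0, 9.5, 7.0, 3.0, 6.5, 8.0, 4.0, 10.5]
abs_res_log = [2.0, 1.5, 3.0, 2.5, 1.0, 2.0, 3.5, 1.5]

diffs = [a - b for a, b in zip(abs_res_linear, abs_res_log)]
p_value = sign_flip_p_value(diffs)
print("p =", p_value)
```

A participant would then be classified as following one model only if the test rejects equal fit, and would otherwise remain unclassified instead of being assigned on the basis of a negligible difference in R².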

A fourth remark refers to the assumption, made implicitly or 
explicitly, that linear relationships between estimations and actual 
magnitudes, as reflected for instance by a better fit of a linear 
model (e.g., Siegler and Opfer, 2003), are the "idealized develop- 
mental endpoint of numerical estimation" (Ashcraft and Moore, 
2012, p. 256; see also Hollands et al., 2002). Even if estimations 
rather obey a linear function, they might significantly and sys- 
tematically deviate from the actual values, depending on the slope 
and intercept of the fitted linear function. Thus, even if equidis- 
tance between neighboring numbers is assumed, the estimations 
might deviate fundamentally from the actual values (cf. Moeller 
and Nuerk, 2011). In turn, estimations that are better explained by
a power or logarithmic model might correspond on average bet- 
ter to the actual magnitudes than a linear model. However, the use 
of the best-fitting function might be questioned if the functions 
make similar predictions with respect to the observable estima- 
tion behavior (cf. Wagenaar, 1975; Dehaene, 2001; Thompson and 
Opfer, 2008; Cantlon et al., 2009). It thus might be useful to consider
not only the shape of the estimations but also the accuracy in
terms of both absolute and simple deviations from the actual val- 
ues, as well as the variability of the estimations (see also Holloway 
and Ansari, 2008; White and Szucs, 2012). 
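
The point that a linear pattern does not guarantee accurate estimations can be illustrated with a small sketch. It computes the mean signed deviation (direction of bias), the mean absolute deviation (accuracy), and the standard deviation of the deviations (variability) for invented estimates that lie exactly on a straight line with a slope of 0.5. The function name and the data are hypothetical.

```python
from statistics import mean, stdev

def accuracy_summary(targets, estimates):
    """Signed bias, absolute error, and variability of estimates relative to targets."""
    signed = [e - t for t, e in zip(targets, estimates)]
    return {
        "mean_signed_deviation": mean(signed),
        "mean_absolute_deviation": mean(abs(d) for d in signed),
        "sd_of_deviations": stdev(signed),
    }

# Invented estimates that are perfectly linear (estimate = 0.5 * target + 10)
# and would therefore be fitted perfectly by a linear model.
targets = [10, 30, 50, 70, 90]
estimates = [0.5 * t + 10 for t in targets]

summary = accuracy_summary(targets, estimates)
print(summary)
```

Although a linear model fits these data without error, the estimates are on average 15 units too low and deviate by 17 units in absolute terms, so shape and accuracy need to be reported separately.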

Given the current state of research, future research might rather 
focus on conditions that lead to biased magnitude estimations 
and on how these estimations and the underlying "number sense" 
(Dehaene, 1997) might be improved. First attempts have already been made
in the field of developmental research, where number
games have proven to support equidistance in the mental repre- 
sentation of numbers (e.g., Wilson et al., 2006; Siegler and Ramani, 
2008; Whyte and Bull, 2008). Other approaches might include pro- 
moting the familiarity with (e.g., Ebersbach et al., 2008) and the 
embodiment of numbers (e.g., Fischer et al., 2011) as potential 
precursors of an appropriate representation of the number system. 
Furthermore, the nature of estimation processes might be inspected 
further, such as when and how anchor cues are used or internally 
created, in particular in the course of development (see Schneider
et al., 2008; Ashcraft and Moore, 2012; White and Szucs, 2012). 
It has been shown that adults use anchors to adjust their numerical
estimations (Izard and Dehaene, 2008), but studies on whether
children are able to do so and on what might affect their use of
anchors (e.g., number knowledge, working memory) are rare
(for exceptions see Newman and Berger, 1984; Petitto, 1990). Thus,
the question of how and which estimation strategies are applied 
should be addressed. In addition, as strategies affect estimations 
(e.g., Ashcraft and Moore, 2012) one might question the funda- 
mental assumption underlying the use of estimation paradigms, 
namely that estimations are a suitable instrument to tap the underlying
mental representation at all (Gescheider, 1998; Moeller and
Nuerk, 2011). To sum up, we put forward a taxonomy that might 
contribute to a better comparability of studies on absolute magni- 
tude estimations. We propose that the research focus might switch 
from trying to identify the model that describes the estimations best 
toward conditions and strategies that lead to estimation biases and 
toward procedures that might ward off these biases. 